Section 24

Empirical process and Kolmogorov's chaining.

Empirical process and the Kolmogorov-Smirnov test. In this section we show how the Brownian bridge
arises in another central limit theorem on the space of continuous functions on $[0,1]$. Let us start with a
motivating example from statistics. Suppose that $x_1, \ldots, x_n$ are i.i.d. uniform random variables on $[0,1]$. By
the law of large numbers, for any $t \in [0,1]$, the empirical c.d.f. $n^{-1}\sum_{i=1}^n I(x_i \le t)$ converges to the true c.d.f.
$P(x_1 \le t) = t$ almost surely and, moreover, by the CLT,

\[ X_t^n = \sqrt{n}\Bigl(\frac{1}{n}\sum_{i=1}^n I(x_i \le t) - t\Bigr) \xrightarrow{d} N(0, t(1-t)). \]

The stochastic process $X_t^n$ is called the empirical process. The covariance of this process, for $s \le t$,

\[ E X_t^n X_s^n = E\bigl(I(x_1 \le t) - t\bigr)\bigl(I(x_1 \le s) - s\bigr) = s - ts - ts + ts = s(1-t), \]

is the same as the covariance of the Brownian bridge and, by the multivariate CLT, finite dimensional
distributions of the empirical process converge to f.d. distributions of the Brownian bridge: for any finite set $F \subseteq [0,1]$,

\[ \mathcal{L}\bigl((X_t^n)_{t\in F}\bigr) \to \mathcal{L}\bigl((B_t)_{t\in F}\bigr). \tag{24.0.1} \]
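As a sanity check on the covariance formula, the following Python sketch estimates $E X_s^n X_t^n$ by simulation and compares it with $s(1-t)$. The function names, the sample size, the number of repetitions and the seed are our own illustrative choices, not part of the argument.

```python
import random

def empirical_process(sample, t):
    """X_t^n = sqrt(n) * (F_n(t) - t) for a sample of uniforms on [0, 1]."""
    n = len(sample)
    return n ** 0.5 * (sum(x <= t for x in sample) / n - t)

def mc_covariance(s, t, n=50, reps=10000, seed=0):
    """Monte Carlo estimate of E[X_s^n X_t^n] (illustrative helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        total += empirical_process(sample, s) * empirical_process(sample, t)
    return total / reps

s, t = 0.3, 0.7
# The estimate should be close to the Brownian bridge covariance s(1 - t).
print(mc_covariance(s, t), s * (1 - t))
```

Note that the covariance equals $s(1-t)$ for every $n$, not only in the limit, so the agreement does not depend on taking $n$ large.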

However, we would like to show the convergence of $X_t^n$ to $B_t$ in some stronger sense that would imply weak
convergence of continuous functions of the process on the space $(C[0,1], \|\cdot\|_\infty)$.

The Kolmogorov-Smirnov test in statistics provides one possible motivation. Suppose that i.i.d. $(X_i)_{i\ge 1}$
have continuous distribution with c.d.f. $F(t) = P(X_1 \le t)$. Let $F_n(t) = n^{-1}\sum_{i=1}^n I(X_i \le t)$ be the empirical
c.d.f. It is easy to see the equality in distribution

\[ \sup_{t\in\mathbb{R}} \sqrt{n}\,|F_n(t) - F(t)| \stackrel{d}{=} \sup_{t\in[0,1]} |X_t^n| \]

because $F(X_i)$ have uniform distribution on $[0,1]$. In order to test whether $(X_i)_{i\ge 1}$ come from the distribution
with c.d.f. $F$, statisticians need to know the distribution of the above supremum or, as an approximation,
the distribution of its limit. Equation (24.0.1) suggests that

\[ \mathcal{L}\bigl(\sup_t |X_t^n|\bigr) \to \mathcal{L}\bigl(\sup_t |B_t|\bigr). \tag{24.0.2} \]
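To make the statistic concrete, here is a minimal Python sketch that computes $\sqrt{n}\sup_t |F_n(t) - F(t)|$ for a given sample and a continuous c.d.f. It uses the fact that the supremum is attained at the jumps of $F_n$. The names `ks_statistic` and `exp_cdf` and the data are hypothetical, chosen only for illustration.

```python
import math

def ks_statistic(sample, cdf):
    """sqrt(n) * sup_t |F_n(t) - F(t)| for a continuous c.d.f. F."""
    u = sorted(cdf(x) for x in sample)   # F(X_i) are uniform on [0, 1] under F
    n = len(u)
    # Just after the i-th order statistic F_n equals (i + 1) / n,
    # just before it F_n equals i / n (with i counted from 0).
    d_plus = max((i + 1) / n - ui for i, ui in enumerate(u))
    d_minus = max(ui - i / n for i, ui in enumerate(u))
    return math.sqrt(n) * max(d_plus, d_minus)

def exp_cdf(x):
    """C.d.f. of the standard exponential distribution (example choice)."""
    return 1.0 - math.exp(-x)

data = [0.1, 0.5, 1.2, 2.3]
print(ks_statistic(data, exp_cdf))
```

Because only the values $F(X_i)$ enter the computation, the statistic has the same distribution for every continuous $F$, which is the equality in distribution stated above.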

Since $B_t$ is sample continuous, its distribution is the law on the metric space $(C[0,1], \|\cdot\|_\infty)$. Even though
$X_t^n$ is not continuous, its jumps are of order $n^{-1/2}$ so it has a "close" continuous version $Y_t^n$. Since $\|\cdot\|_\infty$ is a
continuous functional on $C[0,1]$, (24.0.2) would hold if we can prove weak convergence $\mathcal{L}(Y_t^n) \to \mathcal{L}(B_t)$ on the
space $(C[0,1], \|\cdot\|_\infty)$. Lemma 36 in Section 18 shows that we only need to prove uniform tightness of $\mathcal{L}(Y_t^n)$
because, by Lemma 45, (24.0.1) already identifies the law of the Brownian bridge as the unique possible
limit. Thus, we need to address the question of uniform tightness of $(\mathcal{L}(X_t^n))$ on the complete separable
space $(C[0,1], \|\cdot\|_\infty)$ or equivalently, by the result of the previous section, the equicontinuity of $X_t^n$: for any $\varepsilon > 0$,

\[ \lim_{\delta\to 0}\limsup_{n\to\infty} P\bigl(m(X^n,\delta) > \varepsilon\bigr) = 0. \]



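By Chebyshev's inequality,

\[ P\bigl(m(X^n,\delta) > \varepsilon\bigr) \le \frac{1}{\varepsilon}\, E\, m(X^n,\delta) \]

and we need to learn how to control $E\, m(X^n,\delta)$. The modulus of continuity of $X^n$ can be written as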



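\[ m(X^n,\delta) = \sup_{|t-s|\le\delta} |X_t^n - X_s^n| = \sqrt{n}\sup_{|t-s|\le\delta}\Bigl|\frac{1}{n}\sum_{i=1}^n I(s < x_i \le t) - (t-s)\Bigr| = \sqrt{n}\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n f(x_i) - Ef\Bigr| \tag{24.0.3} \]

where we introduced the class of functions

\[ \mathcal{F} = \bigl\{ f(x) = I(s < x \le t) : |t-s| \le \delta \bigr\}. \tag{24.0.4} \]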



We will develop one approach to control the expectation of (24.0.3) for general classes of functions $\mathcal{F}$ and
we will only use the specific definition (24.0.4) at the very end. This will be done in several steps.

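Symmetrization. At the first step, we will replace the empirical process (24.0.3) by a symmetrized
version, called the Rademacher process, that will be easier to control. Let $x_1', \ldots, x_n'$ be independent copies of
$x_1, \ldots, x_n$ and let $\varepsilon_1, \ldots, \varepsilon_n$ be i.i.d. Rademacher random variables, such that $P(\varepsilon_i = 1) = P(\varepsilon_i = -1) = 1/2$.
Let us define

\[ P_n f = \frac{1}{n}\sum_{i=1}^n f(x_i) \quad\text{and}\quad P_n' f = \frac{1}{n}\sum_{i=1}^n f(x_i'). \]

Notice that $E P_n' f = Ef$. Consider the random variables

\[ Z = \sup_{f\in\mathcal{F}} |P_n f - Ef| \quad\text{and}\quad R = \sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n \varepsilon_i f(x_i)\Bigr|. \]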



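Then, using Jensen's inequality and then the triangle inequality, we can write

\[
\begin{aligned}
EZ &= E\sup_{f\in\mathcal{F}} |P_n f - Ef| = E\sup_{f\in\mathcal{F}} |P_n f - E P_n' f| \\
&\le E\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n \bigl(f(x_i) - f(x_i')\bigr)\Bigr| = E\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n \varepsilon_i\bigl(f(x_i) - f(x_i')\bigr)\Bigr| \\
&\le E\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n \varepsilon_i f(x_i)\Bigr| + E\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^n \varepsilon_i f(x_i')\Bigr| = 2ER.
\end{aligned}
\]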



Equality in the second line holds because switching $x_i \leftrightarrow x_i'$ arbitrarily does not change the expectation, so
the equality holds for any fixed $(\varepsilon_i)$ and, therefore, for any random $(\varepsilon_i)$.

□ 
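The symmetrization bound $EZ \le 2ER$ can be illustrated by simulation. The sketch below is only an illustration under our own assumptions: it uses the simpler class of functions $f_t(x) = I(x \le t)$, $t \in [0,1]$, rather than (24.0.4), uniform samples, and hypothetical helper names. For this class both suprema can be computed exactly from the sorted sample.

```python
import random

def sup_deviation(xs):
    """Z = sup_t |F_n(t) - t| over the class f_t(x) = I(x <= t), t in [0, 1]."""
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def rademacher_sup(xs, eps):
    """R = sup_t |(1/n) sum_i eps_i I(x_i <= t)|.

    As t increases the sum runs over a growing prefix of the sorted sample,
    so the supremum is the largest absolute partial sum.
    """
    n = len(xs)
    best = partial = 0
    for _, e in sorted(zip(xs, eps)):
        partial += e
        best = max(best, abs(partial))
    return best / n

def estimate(n=100, reps=2000, seed=1):
    """Monte Carlo estimates of (EZ, ER)."""
    rng = random.Random(seed)
    ez = er = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        eps = [rng.choice((-1, 1)) for _ in range(n)]
        ez += sup_deviation(xs)
        er += rademacher_sup(xs, eps)
    return ez / reps, er / reps

ez, er = estimate()
print(ez, 2 * er)  # symmetrization predicts ez <= 2 * er
```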

Hoeffding's inequality. The first step to control the supremum in $R$ is to control the sum $\sum_{i=1}^n \varepsilon_i f(x_i)$
for a fixed function $f$. Consider an arbitrary sequence $a_1, \ldots, a_n \in \mathbb{R}$. Then the following holds.



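Theorem 57 (Hoeffding) For $t \ge 0$,

\[ P\Bigl(\sum_{i=1}^n \varepsilon_i a_i \ge t\Bigr) \le \exp\Bigl(-\frac{t^2}{2\sum_{i=1}^n a_i^2}\Bigr). \]

Proof. Given $\lambda > 0$, by Chebyshev's inequality,

\[ P\Bigl(\sum_{i=1}^n \varepsilon_i a_i \ge t\Bigr) \le e^{-\lambda t}\, E\exp\Bigl(\lambda\sum_{i=1}^n \varepsilon_i a_i\Bigr) = e^{-\lambda t}\prod_{i=1}^n E\exp(\lambda\varepsilon_i a_i). \]

Using the inequality $(e^x + e^{-x})/2 \le e^{x^2/2}$, we get

\[ E\exp(\lambda\varepsilon_i a_i) = \frac{e^{\lambda a_i} + e^{-\lambda a_i}}{2} \le \exp\Bigl(\frac{\lambda^2 a_i^2}{2}\Bigr). \]

Hence,

\[ P\Bigl(\sum_{i=1}^n \varepsilon_i a_i \ge t\Bigr) \le \exp\Bigl(-\lambda t + \frac{\lambda^2}{2}\sum_{i=1}^n a_i^2\Bigr) \]

and minimizing over $\lambda > 0$ (the minimum is attained at $\lambda = t/\sum_{i=1}^n a_i^2$) gives the result.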

□ 
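For a short sequence the tail probability in Hoeffding's inequality can be computed exactly by enumerating all $2^n$ sign patterns, which gives a direct comparison with the bound $\exp(-t^2/(2\sum a_i^2))$. The following Python sketch does this; the coefficients `a` and the function names are arbitrary illustrative choices.

```python
import itertools
import math

def exact_tail(a, t):
    """P(sum_i eps_i a_i >= t), enumerating all 2^n sign patterns."""
    hits = sum(
        1
        for signs in itertools.product((-1, 1), repeat=len(a))
        if sum(s * x for s, x in zip(signs, a)) >= t
    )
    return hits / 2 ** len(a)

def hoeffding_bound(a, t):
    """Right-hand side exp(-t^2 / (2 sum_i a_i^2)) of Hoeffding's inequality."""
    return math.exp(-t * t / (2 * sum(x * x for x in a)))

a = [1.0, 0.5, 2.0, 1.5, 0.25, 1.0, 0.75, 1.25]
for t in (0.0, 1.0, 2.0, 4.0, 6.0):
    # The exact probability should never exceed the bound.
    print(t, exact_tail(a, t), hoeffding_bound(a, t))
```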

Covering numbers, Kolmogorov's chaining and Dudley's entropy integral. To control $ER$ for
general classes of functions $\mathcal{F}$, we will need to use some measures of complexity of $\mathcal{F}$. First, we will show
how to control the Rademacher process $R$ conditionally on $x_1, \ldots, x_n$.

Definition. Suppose that $(F, d)$ is a totally bounded metric space. For any $u > 0$, a $u$-packing number
of $F$ with respect to $d$ is defined by

\[ D(F,u,d) = \max \mathrm{card}\bigl\{F_u \subseteq F : d(f,g) > u \text{ for all distinct } f,g \in F_u\bigr\} \]

and a $u$-covering number is defined by

\[ N(F,u,d) = \min \mathrm{card}\bigl\{F_u \subseteq F : \forall f \in F\ \exists g \in F_u \text{ such that } d(f,g) \le u\bigr\}. \]

□ 
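The two notions in the definition are linked by a simple greedy procedure: if we keep adding points that are more than $u$ away from all points chosen so far, the resulting set is a $u$-packing that cannot be enlarged, and therefore every point of $F$ is within $u$ of it, i.e. it is also a $u$-cover. The Python sketch below illustrates this on a finite set of integers; the function names and the example set are our own assumptions.

```python
def greedy_packing(points, u, dist):
    """Greedily pick points that are pairwise more than u apart.

    The result is a maximal u-packing of `points`, hence also a u-cover.
    """
    chosen = []
    for p in points:
        if all(dist(p, q) > u for q in chosen):
            chosen.append(p)
    return chosen

def is_cover(points, centers, u, dist):
    """Check that every point is within distance u of some center."""
    return all(any(dist(p, c) <= u for c in centers) for p in points)

def dist(x, y):
    return abs(x - y)

points = list(range(101))            # the set {0, 1, ..., 100} on the line
net = greedy_packing(points, 10, dist)
print(net, is_cover(points, net, 10, dist))
```

Since the greedy set is a cover, its cardinality is at least the covering number, and since it is a packing, its cardinality is at most the packing number.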

Both packing and covering numbers measure how many points are needed to approximate any element in
the set $F$ within distance $u$. It is a simple exercise to show that

\[ N(F,u,d) \le D(F,u,d) \le N(F,u/2,d) \]

and, in this sense, packing and covering numbers are closely related. Let $F$ be a subset of the cube $[-1,1]^n$
equipped with a scaled Euclidean metric

\[ d(f,g) = \Bigl(\frac{1}{n}\sum_{i=1}^n (f_i - g_i)^2\Bigr)^{1/2}. \]

Consider the following Rademacher process on $F$,

\[ \mathcal{R}(f) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i f_i. \]

Then we have the following version of the classical Kolmogorov's chaining lemma. 

Theorem 58 (Kolmogorov's chaining) For any $u > 0$,

\[ P\Bigl(\forall f \in F,\ \mathcal{R}(f) \le 2^{9/2}\int_0^{d(0,f)} \log^{1/2} D(F,\varepsilon,d)\,d\varepsilon + 2^{7/2}\, d(0,f)\sqrt{u}\Bigr) \ge 1 - e^{-u}. \]

Proof. Without loss of generality, assume that $0 \in F$. Define a sequence of subsets

\[ \{0\} = F_0 \subseteq F_1 \subseteq \ldots \subseteq F_j \subseteq \ldots \subseteq F \]

such that $F_j$ satisfies

1. $\forall f,g \in F_j$ with $f \ne g$, $d(f,g) > 2^{-j}$,

2. $\forall f \in F$ we can find $g \in F_j$ such that $d(f,g) \le 2^{-j}$.

$F_0$ obviously satisfies these properties for $j = 0$. To construct $F_{j+1}$ given $F_j$:

• Start with $F_{j+1} := F_j$.

• If possible, find $f \in F$ such that $d(f,g) > 2^{-(j+1)}$ for all $g \in F_{j+1}$.

• Let $F_{j+1} := F_{j+1} \cup \{f\}$ and repeat until you cannot find such $f$.

Define projection $\pi_j : F \to F_j$ as follows:

for $f \in F$ find $g \in F_j$ with $d(f,g) \le 2^{-j}$ and set $\pi_j(f) = g$.
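Any $f \in F$ can be decomposed into the telescopic series

\[ f = \pi_0(f) + (\pi_1(f) - \pi_0(f)) + (\pi_2(f) - \pi_1(f)) + \ldots = \sum_{j=1}^{\infty} (\pi_j(f) - \pi_{j-1}(f)), \]

where we used that $\pi_0(f) = 0$. Moreover,

\[ d(\pi_{j-1}(f), \pi_j(f)) \le d(\pi_{j-1}(f), f) + d(f, \pi_j(f)) \le 2^{-(j-1)} + 2^{-j} = 3\cdot 2^{-j} \le 2^{-j+2}. \]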
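The construction of the nested sets $F_j$ and the projections $\pi_j$ can be sketched in a few lines of Python for a finite set $F$. This is only an illustration under our own assumptions: $F$ is a random finite subset of $[-1,1]^n$ containing $0$, the projection picks the nearest point of $F_j$ (any point within $2^{-j}$ would do), and all names are hypothetical.

```python
import math
import random

def d(f, g):
    """Scaled Euclidean metric on [-1, 1]^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))

def build_nets(F, levels):
    """Nested sets F_0 = {0}, F_1, ..., F_levels; F[0] is assumed to be 0.

    Each F_j is 2^-j separated and every f in F is within 2^-j of F_j.
    """
    nets = [[F[0]]]
    for j in range(1, levels + 1):
        net = list(nets[-1])                 # start with F_j := F_{j-1}
        for f in F:                          # add points far from the current set
            if all(d(f, g) > 2.0 ** -j for g in net):
                net.append(f)
        nets.append(net)
    return nets

def project(f, net):
    """pi_j(f): here the nearest point of the net."""
    return min(net, key=lambda g: d(f, g))

rng = random.Random(0)
n = 5
F = [tuple([0.0] * n)] + [
    tuple(rng.uniform(-1, 1) for _ in range(n)) for _ in range(200)
]
nets = build_nets(F, levels=4)
print([len(net) for net in nets])            # cardinalities |F_0|, ..., |F_4|
```

With these projections one can check numerically that consecutive links satisfy $d(\pi_{j-1}(f), \pi_j(f)) \le 3 \cdot 2^{-j}$, as in the display above.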

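As a result, the $j$th term in the telescopic series for any $f \in F$ belongs to a finite set of possible links

\[ L_{j-1,j} = \bigl\{ f - g : f \in F_j,\ g \in F_{j-1},\ d(f,g) \le 2^{-j+2} \bigr\}. \]

Since $\mathcal{R}(f)$ is linear,

\[ \mathcal{R}(f) = \sum_{j=1}^{\infty} \mathcal{R}\bigl(\pi_j(f) - \pi_{j-1}(f)\bigr). \]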

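We first show how to control $\mathcal{R}$ on the set of all links. Assume that $\ell \in L_{j-1,j}$. By Hoeffding's inequality,

\[ P\Bigl(\mathcal{R}(\ell) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i \ell_i \ge t\Bigr) \le \exp\Bigl(-\frac{t^2}{2 n^{-1}\sum_{i=1}^n \ell_i^2}\Bigr) \le \exp\Bigl(-\frac{t^2}{2\cdot 2^{-2j+4}}\Bigr). \]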

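If $|F|$ denotes the cardinality of the set $F$ then

\[ |L_{j-1,j}| \le |F_{j-1}|\cdot|F_j| \le |F_j|^2 \]

and, therefore,

\[ P\bigl(\forall \ell \in L_{j-1,j},\ \mathcal{R}(\ell) \le t\bigr) \ge 1 - |F_j|^2 \exp\Bigl(-\frac{t^2}{2^{-2j+5}}\Bigr) = 1 - \frac{1}{|F_j|^2}\, e^{-u} \]

after making a change of variables

\[ t = \bigl(2^{-2j+5}(4\log|F_j| + u)\bigr)^{1/2} \le 2^{7/2}\, 2^{-j} \log^{1/2}|F_j| + 2^{5/2}\, 2^{-j}\sqrt{u}. \]

Hence,

\[ P\Bigl(\forall \ell \in L_{j-1,j},\ \mathcal{R}(\ell) \le 2^{7/2}\, 2^{-j}\log^{1/2}|F_j| + 2^{5/2}\, 2^{-j}\sqrt{u}\Bigr) \ge 1 - \frac{1}{|F_j|^2}\, e^{-u}. \]

If $F_{j-1} = F_j$ then we can define $\pi_{j-1}(f) = \pi_j(f)$ and, since in this case $L_{j-1,j} = \{0\}$, there is no need to
control these links. Therefore, we can assume that $|F_{j-1}| < |F_j|$, so that $|F_j| \ge j+1$ because $|F_0| = 1$, and taking a union bound for all steps,

\[
\begin{aligned}
P\Bigl(\forall j \ge 1\ \forall \ell \in L_{j-1,j},\ \mathcal{R}(\ell) \le 2^{7/2}\, 2^{-j}\log^{1/2}|F_j| + 2^{5/2}\, 2^{-j}\sqrt{u}\Bigr)
&\ge 1 - \sum_{j=1}^{\infty} \frac{1}{|F_j|^2}\, e^{-u} \ge 1 - \sum_{j=1}^{\infty} \frac{1}{(j+1)^2}\, e^{-u} \\
&= 1 - (\pi^2/6 - 1)e^{-u} \ge 1 - e^{-u}.
\end{aligned}
\]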



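Given $f \in F$, let integer $k$ be such that $2^{-(k+1)} < d(0,f) \le 2^{-k}$. Then in the above construction we can
assume that $\pi_0(f) = \ldots = \pi_k(f) = 0$, i.e. we will project $f$ on $0$ if possible. Then with probability at least
$1 - e^{-u}$,

\[
\begin{aligned}
\mathcal{R}(f) &= \sum_{j=k+1}^{\infty} \mathcal{R}\bigl(\pi_j(f) - \pi_{j-1}(f)\bigr) \\
&\le \sum_{j=k+1}^{\infty} \Bigl(2^{7/2}\, 2^{-j}\log^{1/2}|F_j| + 2^{5/2}\, 2^{-j}\sqrt{u}\Bigr) \\
&\le 2^{7/2}\sum_{j=k+1}^{\infty} 2^{-j}\log^{1/2} D(F, 2^{-j}, d) + 2^{5/2}\, 2^{-k}\sqrt{u}.
\end{aligned}
\]

Note that $2^{-k} < 2d(f,0)$ and $2^{5/2}\, 2^{-k} \le 2^{7/2} d(f,0)$. Finally, since packing numbers $D(F,\varepsilon,d)$ are decreasing
in $\varepsilon$, we can write (see figure 24)

\[
\begin{aligned}
2^{9/2}\sum_{j=k+1}^{\infty} 2^{-(j+1)}\log^{1/2} D(F, 2^{-j}, d) &\le 2^{9/2}\int_0^{2^{-(k+1)}} \log^{1/2} D(F,\varepsilon,d)\,d\varepsilon \\
&\le 2^{9/2}\int_0^{d(0,f)} \log^{1/2} D(F,\varepsilon,d)\,d\varepsilon
\end{aligned}
\tag{24.0.5}
\]

since $2^{-(k+1)} < d(0,f)$. This finishes the proof.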



[Figure 24: plot of $\log D(F,\varepsilon,d)$ against $\varepsilon$.]



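The integral in (24.0.5) is called Dudley's entropy integral. We would like to apply the bound of the above
theorem to

\[ \sqrt{n}\, R = \sup_{f\in\mathcal{F}}\Bigl|\frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i f(x_i)\Bigr| \]

for a class of functions $\mathcal{F}$ in (24.0.4). Suppose that $x_1, \ldots, x_n \in [0,1]$ are fixed and let

\[ F = \Bigl\{ (f_i)_{1\le i\le n} = \bigl(I(s < x_i \le t)\bigr)_{1\le i\le n} : |t-s| \le \delta \text{ and } t,s \in [0,1] \Bigr\} \subseteq \{0,1\}^n. \]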

Then the following holds.

Lemma 47 $N(F,u,d) \le K u^{-4}$ for some absolute $K > 0$ independent of the points $x_1, \ldots, x_n$.

Proof. We can assume that $x_1 \le \ldots \le x_n$. Then the class $F$ consists of all vectors of the type

\[ (0 \ldots 1 \ldots 1 \ldots 0), \]

i.e. the coordinates equal to 1 come in blocks. Given $u$, let $F_u$ be a subset of such vectors with blocks of 1's
starting and ending at the coordinates $k\lfloor nu \rfloor$. Given any vector $f \in F$, let us approximate it by a vector
$f' \in F_u$ by choosing the closest starting and ending coordinates for the blocks of 1's. The number of different
coordinates will be bounded by $2\lfloor nu \rfloor$ and, therefore, the distance between $f$ and $f'$ will be bounded by

\[ d(f,f') \le \sqrt{2 n^{-1}\lfloor nu \rfloor} \le \sqrt{2u}. \]

The cardinality of $F_u$ is, obviously, of order $u^{-2}$. This proves that $N(F, \sqrt{2u}, d) \le K u^{-2}$. Making the change
of variables $\sqrt{2u} \to u$ proves the result.

□ 
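The approximation step in this proof can be checked numerically. The sketch below is a simplified illustration under our own assumptions: it ignores the constraint $|t-s| \le \delta$, represents a block by its start and end indices, and rounds both to the nearest multiple of $m = \lfloor nu \rfloor$; the values of `n` and `u` and the helper names are hypothetical.

```python
import math

def block(n, a, b):
    """0/1 vector with ones exactly at coordinates a, ..., b - 1 (possibly empty)."""
    return tuple(1 if a <= i < b else 0 for i in range(n))

def d(f, g):
    """Scaled Euclidean metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f, g)) / len(f))

def round_to_grid(a, m):
    """Nearest multiple of m (ties rounded up)."""
    return m * ((a + m // 2) // m)

n, u = 60, 0.1
m = math.floor(n * u)            # grid spacing; the sketch assumes m >= 1
assert m >= 1
net = set()                      # endpoints (a', b') of the approximating blocks
worst = 0.0                      # largest distance d(f, f') encountered
for a in range(n + 1):
    for b in range(a, n + 1):
        # Clamp to n in case n is not a multiple of m.
        a2, b2 = min(round_to_grid(a, m), n), min(round_to_grid(b, m), n)
        net.add((a2, b2))
        worst = max(worst, d(block(n, a, b), block(n, a2, b2)))
print(len(net), worst, math.sqrt(2 * u))
```

The number of approximating blocks is at most $(n/m + 1)^2$, which is of order $u^{-2}$, while the worst distance stays below $\sqrt{2u}$.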

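To apply the Kolmogorov chaining bound to this class $F$ let us make a simple observation that if a random
variable $X \ge 0$ satisfies $P(X \ge a + bt) \le K e^{-t^2}$ for all $t \ge 0$ then

\[ EX = \int_0^\infty P(X \ge t)\,dt \le a + \int_0^\infty P(X \ge a + t)\,dt \le a + K\int_0^\infty e^{-t^2/b^2}\,dt \le a + Kb \le K(a+b), \]

where we used that $\int_0^\infty e^{-t^2/b^2}\,dt = b\sqrt{\pi}/2 \le b$ and, in the last step, assumed as we may that $K \ge 1$.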

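Theorem 58 then implies that

\[ E_\varepsilon \sup_{f\in F}\Bigl|\frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i f_i\Bigr| \le K\Bigl(\int_0^{D_n}\sqrt{\log\frac{K}{u}}\,du + D_n\Bigr) \tag{24.0.6} \]

where $E_\varepsilon$ is the expectation with respect to $(\varepsilon_i)$ only and

\[
\begin{aligned}
D_n^2 &= \sup_{f\in F} d(0,f)^2 = \sup_{f\in F}\frac{1}{n}\sum_{i=1}^n f_i^2 = \sup_{|t-s|\le\delta}\frac{1}{n}\sum_{i=1}^n I(s < x_i \le t) \\
&= \sup_{|t-s|\le\delta}\Bigl|\frac{1}{n}\sum_{i=1}^n I(x_i \le t) - \frac{1}{n}\sum_{i=1}^n I(x_i \le s)\Bigr|.
\end{aligned}
\]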

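Since the integral on the right hand side of (24.0.6) is concave in $D_n$, by Jensen's inequality,

\[ E\sup_{f\in F}\Bigl|\frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i f_i\Bigr| \le K\Bigl(\int_0^{ED_n}\sqrt{\log\frac{K}{u}}\,du + ED_n\Bigr). \]

By the symmetrization inequality, this finally proves that

\[ E\, m(X^n,\delta) \le K\Bigl(\int_0^{ED_n}\sqrt{\log\frac{K}{u}}\,du + ED_n\Bigr). \]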



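The strong law of large numbers easily implies that

\[ \sup_{t\in[0,1]}\Bigl|\frac{1}{n}\sum_{i=1}^n I(x_i \le t) - t\Bigr| \to 0 \quad\text{a.s.} \]

and, therefore, $D_n^2 \to \delta$ a.s. and $ED_n \to \sqrt{\delta}$. This implies that

\[ \limsup_{n\to\infty} E\, m(X^n,\delta) \le K\Bigl(\int_0^{\sqrt{\delta}}\sqrt{\log\frac{K}{u}}\,du + \sqrt{\delta}\Bigr). \]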



The right-hand side goes to zero as $\delta \to 0$ and this finishes the proof of equicontinuity of $X^n$. As a result,
for any continuous function $\Phi$ on $(C[0,1], \|\cdot\|_\infty)$ the distribution of $\Phi(X_t^n)$ converges to the distribution of
$\Phi(B_t)$. For example,

\[ \sqrt{n}\sup_{0\le t\le 1}\Bigl|\frac{1}{n}\sum_{i=1}^n I(x_i \le t) - t\Bigr| \to \sup_{0\le t\le 1}|B_t| \]

in distribution. We will find the distribution of the right hand side in the next section.
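To see concretely that a bound of the form $\int_0^{\sqrt{\delta}}\sqrt{\log(K/u)}\,du + \sqrt{\delta}$ vanishes as $\delta \to 0$, one can evaluate the entropy integral numerically. The Python sketch below is an illustration only: the value $K = e$ is an arbitrary choice (the argument above does not specify the constant), and the closed form comes from the substitution $u = K e^{-v^2}$.

```python
import math

def entropy_integral(a, K=math.e, steps=100000):
    """Midpoint-rule value of int_0^a sqrt(log(K / u)) du, assuming 0 < a <= K."""
    h = a / steps
    return h * sum(
        math.sqrt(math.log(K / ((i + 0.5) * h))) for i in range(steps)
    )

def entropy_integral_closed(a, K=math.e):
    """Same integral via u = K exp(-v^2):
    a * v0 + K * sqrt(pi) / 2 * erfc(v0), where v0 = sqrt(log(K / a)).
    """
    v0 = math.sqrt(math.log(K / a))
    return a * v0 + K * math.sqrt(math.pi) / 2 * math.erfc(v0)

# The two evaluations should agree closely.
print(entropy_integral(0.5), entropy_integral_closed(0.5))

for delta in (0.5, 0.1, 0.01, 0.001):
    a = math.sqrt(delta)
    # Bound without the outer constant: integral up to sqrt(delta) plus sqrt(delta).
    print(delta, entropy_integral_closed(a) + a)
```

The printed values decrease with $\delta$, in line with the integrability of $\sqrt{\log(K/u)}$ near zero.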